AITopics | relational table

Collaborating Authors

relational table

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Graph-Based Feature Augmentation for Predictive Tasks on Relational Datasets

Qiao, Lianpeng, Cao, Ziqi, Feng, Kaiyu, Yuan, Ye, Wang, Guoren

arXiv.org Artificial IntelligenceAug-29-2025

Data has become a foundational asset driving innovation across domains such as finance, healthcare, and e-commerce. In these areas, predictive modeling over relational tables is commonly employed, with increasing emphasis on reducing manual effort through automated machine learning (AutoML) techniques. This raises an interesting question: can feature augmentation itself be automated and identify and utilize task-related relational signals? To address this challenge, we propose an end-to-end automated feature augmentation framework, ReCoGNN, which enhances initial datasets using features extracted from multiple relational tables to support predictive tasks. ReCoGNN first captures semantic dependencies within each table by modeling intra-table attribute relationships, enabling it to partition tables into structured, semantically coherent segments. It then constructs a heterogeneous weighted graph that represents inter-row relationships across all segments. Finally, ReCoGNN leverages message-passing graph neural networks to propagate information through the graph, guiding feature selection and augmenting the original dataset. Extensive experiments conducted on ten real-life and synthetic datasets demonstrate that ReCoGNN consistently outperforms existing methods on both classification and regression tasks.

artificial intelligence, machine learning, tuple, (16 more...)

arXiv.org Artificial Intelligence

2508.20986

Genre: Research Report > New Finding (0.93)

Industry:

Banking & Finance (0.46)
Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Generating Synthetic Relational Tabular Data via Structural Causal Models

Hoppe, Frederik, Franz, Astrid, Kleinemeier, Lars, Göbel, Udo

arXiv.org Artificial IntelligenceJul-8-2025

Synthetic tabular data generation has received increasing attention in recent years, particularly with the emergence of foundation models for tabular data. The breakthrough success of TabPFN (Hollmann et al.,2025), which leverages vast quantities of synthetic tabular datasets derived from structural causal models (SCMs), demonstrates the critical role synthetic data plays in developing powerful tabular foundation models. However, most real-world tabular data exists in relational formats spanning multiple interconnected tables - a structure not adequately addressed by current generation methods. In this work, we extend the SCM-based approach by developing a novel framework that generates realistic synthetic relational tabular data including causal relationships across tables. Our experiments confirm that this framework is able to construct relational datasets with complex inter-table dependencies mimicking real-world scenarios.

artificial intelligence, dataset, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.03528

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Germany > Bremen > Bremen (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)

Add feedback

Tabular Embeddings for Tables with Bi-Dimensional Hierarchical Metadata and Nesting

Shrestha, Gyanendra, Jiang, Chutain, Akula, Sai, Yannam, Vivek, Pyayt, Anna, Gubanov, Michael

arXiv.org Artificial IntelligenceFeb-19-2025

Embeddings serve as condensed vector representations for real-world entities, finding applications in Natural Language Processing (NLP), Computer Vision, and Data Management across diverse downstream tasks. Here, we introduce novel specialized embeddings optimized, and explicitly tailored to encode the intricacies of complex 2-D context in tables, featuring horizontal, vertical hierarchical metadata, and nesting. To accomplish that we define the Bi-dimensional tabular coordinates, separate horizontal, vertical metadata and data contexts by introducing a new visibility matrix, encode units and nesting through the embeddings specifically optimized for mimicking intricacies of such complex structured data. Through evaluation on 5 large-scale structured datasets and 3 popular downstream tasks, we observed that our solution outperforms the state-of-the-art models with the significant MAP delta of up to 0.28. GPT-4 LLM+RAG slightly outperforms us with MRR delta of up to 0.1, while we outperform it with the MAP delta of up to 0.42.

dataset, metadata, relational table, (14 more...)

arXiv.org Artificial Intelligence

2502.15819

Country:

North America > United States > Florida > Hillsborough County > Tampa (0.14)
North America > United States > Florida > Leon County > Tallahassee (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.68)
Education (0.67)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Coresets for Relational Data and The Applications Jiaxiang Chen 1 Qingyuan Yang Ruomin Huang

Neural Information Processing SystemsFeb-18-2024, 05:21:54 GMT

A coreset is a small set that can approximately preserve the structure of the original input data set. Therefore we can run our algorithm on a coreset so as to reduce the total computational complexity. Conventional coreset techniques assume that the input data set is available to process explicitly. However, this assumption may not hold in real-world scenarios. In this paper, we consider the problem of coresets construction over relational data. Namely, the data is decoupled into several relational tables, and it could be very expensive to directly materialize the data matrix by joining the tables. We propose a novel approach called "aggregation tree with pseudo-cube" that can build a coreset from bottom to up. Moreover, our approach can neatly circumvent several troublesome issues of relational learning problems [Khamis et al., PODS 2019]. Under some mild assumptions, we show that our coreset approach can be applied for the machine learning tasks, such as clustering, logistic regression and SVM.

algorithm, coreset, relational data, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Asia > China (0.04)
South America > Uruguay > Montevideo > Montevideo (0.04)
(17 more...)

Genre:

Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Observatory: Characterizing Embeddings of Relational Tables

Cong, Tianji, Hulsebos, Madelon, Sun, Zhenjie, Groth, Paul, Jagadish, H. V.

arXiv.org Artificial IntelligenceJan-27-2024

Language models and specialized table embedding models have recently demonstrated strong performance on many tasks over tabular data. Researchers and practitioners are keen to leverage these models in many new application contexts; but limited understanding of the strengths and weaknesses of these models, and the table representations they generate, makes the process of finding a suitable model for a given task reliant on trial and error. There is an urgent need to gain a comprehensive understanding of these models to minimize inefficiency and failures in downstream usage. To address this need, we propose Observatory, a formal framework to systematically analyze embedding representations of relational tables. Motivated both by invariants of the relational data model and by statistical considerations regarding data distributions, we define eight primitive properties, and corresponding measures to quantitatively characterize table embeddings for these properties. Based on these properties, we define an extensible framework to evaluate language and table embedding models. We collect and synthesize a suite of datasets and use Observatory to analyze nine such models. Our analysis provides insights into the strengths and weaknesses of learned representations over tables. We find, for example, that some models are sensitive to table structure such as column order, that functional dependencies are rarely reflected in embeddings, and that specialized table embedding models have relatively lower sample fidelity. Such insights help researchers and practitioners better anticipate model behaviors and select appropriate models for their downstream tasks, while guiding researchers in the development of new models.

arXiv.org Artificial Intelligence

2310.07736

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(15 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Relational Deep Learning: Graph Representation Learning on Relational Databases

Fey, Matthias, Hu, Weihua, Huang, Kexin, Lenssen, Jan Eric, Ranjan, Rishabh, Robinson, Joshua, Ying, Rex, You, Jiaxuan, Leskovec, Jure

arXiv.org Artificial IntelligenceDec-7-2023

Much of the world's most valued data is stored in relational databases and data warehouses, where the data is organized into many tables connected by primary-foreign key relations. However, building machine learning models using this data is both challenging and time consuming. The core problem is that no machine learning method is capable of learning on multiple tables interconnected by primary-foreign key relations. Current methods can only learn from a single table, so the data must first be manually joined and aggregated into a single training table, the process known as feature engineering. Feature engineering is slow, error prone and leads to suboptimal models. Here we introduce an end-to-end deep representation learning approach to directly learn on data laid out across multiple tables. We name our approach Relational Deep Learning (RDL). The core idea is to view relational databases as a temporal, heterogeneous graph, with a node for each row in each table, and edges specified by primary-foreign key links. Message Passing Graph Neural Networks can then automatically learn across the graph to extract representations that leverage all input data, without any manual feature engineering. Relational Deep Learning leads to more accurate models that can be built much faster. To facilitate research in this area, we develop RelBench, a set of benchmark datasets and an implementation of Relational Deep Learning. The data covers a wide spectrum, from discussions on Stack Exchange to book reviews on the Amazon Product Catalog. Overall, we define a new research area that generalizes graph machine learning and broadens its applicability to a wide set of AI use cases.

graph, learning, relational database, (12 more...)

arXiv.org Artificial Intelligence

2312.04615

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
North America > United States > Illinois (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

API-Assisted Code Generation for Question Answering on Varied Table Structures

Cao, Yihan, Chen, Shuyi, Liu, Ryan, Wang, Zhiruo, Fried, Daniel

arXiv.org Artificial IntelligenceOct-23-2023

A persistent challenge to table question answering (TableQA) by generating executable programs has been adapting to varied table structures, typically requiring domain-specific logical forms. In response, this paper introduces a unified TableQA framework that: (1) provides a unified representation for structured tables as multi-index Pandas data frames, (2) uses Python as a powerful querying language, and (3) uses few-shot prompting to translate NL questions into Python programs, which are executable on Pandas data frames. Furthermore, to answer complex relational questions with extended program functionality and external knowledge, our framework allows customized APIs that Python programs can call. We experiment with four TableQA datasets that involve tables of different structures -- relational, multi-table, and hierarchical matrix shapes -- and achieve prominent improvements over past state-of-the-art systems. In ablation studies, we (1) show benefits from our multi-index representation and APIs over baselines that use only an LLM, and (2) demonstrate that our approach is modular and can incorporate additional APIs.

computational linguistic, dataset, table structure, (14 more...)

arXiv.org Artificial Intelligence

2310.14687

Country:

North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(10 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples

Li, Peng, He, Yeye, Yan, Cong, Wang, Yue, Chaudhuri, Surajit

arXiv.org Artificial IntelligenceAug-9-2023

Relational tables, where each row corresponds to an entity and each column corresponds to an attribute, have been the standard for tables in relational databases. However, such a standard cannot be taken for granted when dealing with tables "in the wild". Our survey of real spreadsheet-tables and web-tables shows that over 30% of such tables do not conform to the relational standard, for which complex table-restructuring transformations are needed before these tables can be queried easily using SQL-based analytics tools. Unfortunately, the required transformations are non-trivial to program, which has become a substantial pain point for technical and non-technical users alike, as evidenced by large numbers of forum questions in places like StackOverflow and Excel/Power-BI/Tableau forums. We develop an Auto-Tables system that can automatically synthesize pipelines with multi-step transformations (in Python or other languages), to transform non-relational tables into standard relational forms for downstream analytics, obviating the need for users to manually program transformations. We compile an extensive benchmark for this new task, by collecting 244 real test cases from user spreadsheets and online forums. Our evaluation suggests that Auto-Tables can successfully synthesize transformations for over 70% of test cases at interactive speeds, without requiring any input from users, making this an effective tool for both technical and non-technical users to prepare data for analytics.

data mining, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2307.14565

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (0.82)
Workflow (0.68)

Technology:

Information Technology > Software (1.00)
Information Technology > Databases (1.00)
Information Technology > Data Science > Data Mining (1.00)
(4 more...)

Add feedback

A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries

Kudva, Prabhakar, Bordawekar, Rajesh, Nitsure, Apoorva

arXiv.org Artificial IntelligenceMar-1-2023

AI-Powered database (AI-DB) is a novel relational database system that uses a self-supervised neural network, database embedding, to enable semantic SQL queries on relational tables. In this paper, we describe an architecture and implementation of in-database interpretability infrastructure designed to provide simple, transparent, and relatable insights into ranked results of semantic SQL queries supported by AI-DB. We introduce a new co-occurrence based interpretability approach to capture relationships between relational entities and describe a space-efficient probabilistic Sketch implementation to store and process co-occurrence counts. Our approach provides both query-agnostic (global) and query-specific (local) interpretabilities. Experimental evaluation demonstrate that our in-database probabilistic approach provides the same interpretability quality as the precise space-inefficient approach, while providing scalable and space efficient runtime behavior (up to 8X space savings), without any user intervention.

machine learning, natural language, sketch, (20 more...)

arXiv.org Artificial Intelligence

2302.12178

Country:

North America > United States > California (0.14)
North America > United States > Virginia (0.05)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre:

Overview (0.93)
Research Report (0.83)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance (1.00)
Transportation (0.68)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Add feedback

Filters

Collaborating Authors

relational table

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

029f82afd78288059dc946b105c451fd-Paper-Conference.pdf

Graph-Based Feature Augmentation for Predictive Tasks on Relational Datasets

Generating Synthetic Relational Tabular Data via Structural Causal Models

Tabular Embeddings for Tables with Bi-Dimensional Hierarchical Metadata and Nesting

Coresets for Relational Data and The Applications Jiaxiang Chen 1 Qingyuan Yang Ruomin Huang

Observatory: Characterizing Embeddings of Relational Tables

Relational Deep Learning: Graph Representation Learning on Relational Databases

API-Assisted Code Generation for Question Answering on Varied Table Structures

Auto-Tables: Synthesizing Multi-Step Transformations to Relationalize Tables without Using Examples

A Scalable Space-efficient In-database Interpretability Framework for Embedding-based Semantic SQL Queries